Multi-Choice Minority Game 
The generalization of the problem of adaptive competition, known as the minority game, to
the case of $K$ possible choices for each player is addressed and applied to a system of interacting
perceptrons whose input and output units are $K$-state Potts spins. An optimal solution
of this minority game, as well as the dynamic evolution of the adaptive strategies of the players,
is derived analytically for general $K$ and compared with numerical simulations.



I. INTRODUCTION 

Considerable progress in the theoretical understanding of market phenomena has been achieved
by the study of the minority game. This prototypical model describes a system of agents
interacting through a market mechanism. The game is based on the idea that the behavior of the
agents is determined by the economic rule of supply and demand. According to this rule, given
the available options (such as buy/sell), an agent wins if he chooses the minority action.
Research on this game has focused on cases in which each agent can choose between two options
using its most efficient strategy, where the strategies remain unchanged throughout the game.
However, in the real world, many situations of interest involve more than two decision options
as well as agents with dynamic strategies. Decisions such as where to spend the summer vacation
or which server to choose while surfing the web (or, more generally, how to distribute data
traffic in computer networks [7]) are only two among many common problems with more than two
options. Therefore, it is tempting to investigate cases with more than two possible choices
provided to agents with dynamic strategies. In a recent study of an extension in which each
agent is equipped with a neural network for making his decision [8], it was shown that a
certain updating rule of the strategies of the agents improves the efficiency of the market,
which is measured by the global profit of the agents. In this paper we generalize the
aforementioned work to a multi-choice minority game, namely a game with general $K$ decision
states.

The multi-choice minority game consists of N players 
(agents) and K possible decisions. In each step, each 
one of the players chooses one of the K states, aiming to 
choose the state with the smallest number of agents. For 
example, a situation may arise, in which there are several 
possible roads which lead from place A to place B, and 
each driver who wants to get from A to B chooses one of 
the available roads. Because drivers want to avoid traffic 
jams, they try to choose the least traveled roads, assum- 
ing that all the roads are of the same length. Similarly, 
one usually prefers to go to the bar with the smallest 
number of people in it. Occurring over and over again, 
the minority decisions in these and other similar situa- 



tions generate time series whose term at time t, Xt, has 
an integer value between 1 and K according to the mi- 
nority decision. In the original game, the information 
provided to each player is the history vector of size M, 
whose components are the last M minority states. 

The paper is organized as follows. In section II a multilayer neural network and the dynamic
evolution of its weights are introduced. For the clarity of the rest of the paper, which is
somewhat technical, we also briefly discuss the main findings and results there. In section III
the reference case of players with random strategies is solved analytically. In section IV the
global profit of the players for the network with optimal strategies (weights) is solved
analytically in the thermodynamic limit, and shown to be superior to a random decision. The
analytical results are compared with simulations on finite systems. In section V, the suggested
updating rules for the weights are examined analytically and are found to saturate
asymptotically the optimal global profit. Finally, section VI is devoted to a short summary and
an outlook.



II. THE MODEL 



While many strategies for the multi-choice minority game are conceivable, we study the following
model, which uses neural networks: each one of the $N$ players is represented by a perceptron of
size $M$. The weights belonging to the $i$th player are $\{w_{ij}\}$, where $j = 1, \ldots, M$.
All $N$ perceptrons have a common input which consists of $M$ components $x_1, \ldots, x_M$,
where each one of the components takes one of the $K$ integers $1, 2, \ldots, K$ with equal
probability.

The dynamics are defined by the following steps. In the first step, each one of the perceptrons
calculates the $K$ induced local fields. For instance, the field $h_{i,m}$ induced by the $m$th
state on player $i$ is defined as the summation over all weights belonging to the $i$th
perceptron with input equal to $m$:

$$ h_{i,m} = \sum_{j=1}^{M} w_{ij}\, \delta_{x_j, m}. \qquad (1) $$

In the second step, each player chooses its state, $\sigma_i$, following the maximal induced
field:

$$ \sigma_i = \{ k \mid \max_{m=1,\ldots,K} h_{i,m} = h_{i,k} \}, \qquad (2) $$

where $\sigma_i$ is the output (chosen state) of the $i$th perceptron. In the third step, the
occupancy of each state is calculated:

$$ N_\rho = \sum_{i=1}^{N} \delta_{\sigma_i, \rho}, \qquad (3) $$

where it is clear that $\sum_\rho N_\rho = N$. The output, min, of the network is the minority
decision

$$ \mathrm{min} = \{ \rho \mid \min_{m=1,\ldots,K} N_m = N_\rho \}. \qquad (4) $$
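The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation code used for the results reported here: the function name `play_round` is hypothetical, states are labeled 1 to K, and ties (in the maximal field or in the minimal occupancy) are broken by taking the lowest label, which is an assumption since the text does not specify a tie-breaking rule.

```python
import random

def play_round(weights, x, K):
    """One round: fields (Eq. 1), choices (Eq. 2), occupancies (Eq. 3), minority (Eq. 4)."""
    choices = []
    for w in weights:                          # w holds the M weights of one player
        h = [0.0] * K                          # h[m - 1] is the field induced by state m
        for w_j, x_j in zip(w, x):
            h[x_j - 1] += w_j                  # only weights whose input equals m contribute
        choices.append(h.index(max(h)) + 1)    # state with the maximal field (ties: lowest label)
    occupancy = [choices.count(s) for s in range(1, K + 1)]
    minority = occupancy.index(min(occupancy)) + 1   # least occupied state (ties: lowest label)
    return choices, occupancy, minority

if __name__ == "__main__":
    rng = random.Random(0)                     # fixed seed, illustrative sizes
    N, M, K = 11, 20, 3
    weights = [[rng.gauss(0.0, M ** -0.5) for _ in range(M)] for _ in range(N)]
    x = [rng.randint(1, K) for _ in range(M)]
    print(play_round(weights, x, K))
```

Note that a state chosen by no player at all counts as the minority state under Eq. (4) in this sketch.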

The game can also be represented by a feedforward network $M:N:1$ ($M$ input units, $N$ hidden
units and 1 output). All units (input, hidden, output) are represented by $K$-state Potts spins.
The weights $\{w_{ij}\}$ are from the input units to the hidden units, and the weights from the
hidden units to the output are all equal to $-1$. The dynamics of the hidden and output units
are similar to the zero-temperature dynamics of Potts-spin systems [9,10], following the maximal
induced field. The free parameters in our game are the $MN$ weights, $\{w_{ij}\}$, from the
input to the hidden units. Their values are determined by the strategy adopted by each one of
the players. Our local dynamic rules are based on the generalization of the on-line Hebbian
learning rule for $K = 2$ [8] to the general $K$-state Potts model, with the following updating
rule:

$$ w_{ij}^{+} = w_{ij} + \frac{\eta}{M} \left( K \delta_{x_j, \mathrm{min}} - 1 \right), \qquad (5) $$

where $\eta$ is the learning rate and the superscript $+$ indicates the next time step. Note
that all agents use the same rule for updating their strategy.
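As an illustration of the updating rule, the following Python sketch applies one learning step to all players. It assumes the rule reads $w_{ij}^{+} = w_{ij} + (\eta/M)(K\delta_{x_j,\mathrm{min}} - 1)$, with "min" the minority state of the current round; the function name `hebbian_update` and the list-of-lists layout of the weights are hypothetical choices made for this example.

```python
def hebbian_update(weights, x, minority, eta, K):
    """Return new weights after one step of w_ij += (eta / M) * (K * [x_j == minority] - 1)."""
    M = len(x)
    # The increment depends only on the input component x_j, not on the player index i.
    step = [(eta / M) * (K * (1 if x_j == minority else 0) - 1) for x_j in x]
    return [[w_j + s_j for w_j, s_j in zip(w, step)] for w in weights]

if __name__ == "__main__":
    old = [[0.0] * 4, [1.0] * 4]
    print(hebbian_update(old, [1, 2, 1, 3], minority=1, eta=0.4, K=3))
```

Because the increment is the same for every player in this sketch, one step shifts all weight vectors by a common vector and leaves their mutual differences unchanged.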

The score of the game is determined similarly to the Ising case. Players belonging to the
minority ($N_{\min}$ players) gain $Q_+$, while the other $N - N_{\min}$ players gain $Q_-$,
where $Q_+ > Q_-$. Note that in most previous works $Q_+$ was chosen to be 1 and $Q_-$ was
chosen to be either 0 or $-1$. The global profit in such cases is

$$ U = Q_- N + (Q_+ - Q_-) N_{\min}. \qquad (6) $$

It is clear that the maximization of the global profit $U$ is equivalent to the maximization of
$N_{\min}$, which is bounded from above by $N/K$. Note that in the Ising case each player
belongs either to the minority or to the majority, whereas in the Potts case the situation is
more complex. The score may depend on the exact values of $\{N_\rho\}$ (the score decreases with
$N_\rho$), hence the total profit $U = U(\{N_\rho\})$. In such a case the maximization of the
total profit may differ from the maximization of $N_{\min}$; this is discussed briefly at the
end of this paper.



Before we turn to the derivation of the results, which is more involved than for the Ising
case, let us present the main results: (a) The score and the dynamics are formulated
analytically for general $K$, the number of possible decisions. Exact results are obtained for
$K \le 6$ and asymptotically for $K \to \infty$. Results for intermediate values of $K$ are
obtained from simulations. (b) A relaxation to the optimal score is achieved for small learning
rates. (c) Regarding the optimal case, the deviation of the minority group size from $N/K$ is
found to be non-monotonic with $K$. (d) The total score is independent of the size of the
history ($M$, the size of the input) available to the agents. (e) All agents use the same type
of dynamic strategy and gain on average (over time) the same profit. Our system does not undergo
a phase transition to a state where the symmetry among the agents is broken into losers and
winners. Throughout the investigation of the game we assume that the memory size $M$ is larger
than the number of players $N$ (otherwise the completely symmetric Potts configuration is
geometrically impossible). Nevertheless, simulations of the same dynamics for systems with
$M < N$ show even better results for the global profit.

III. THE RANDOM CASE 

In the case where the maximization of the global profit $U$ is identical to the maximization of
$N_{\min}$, the quantity of interest is

$$ \langle \epsilon_{\min}^2 \rangle = \langle (N_{\min} - N/K)^2 \rangle / N, \qquad (7) $$

where the symbol $\langle \cdot \rangle$ indicates an average over input patterns, $N/K$ is the
average number of players in each state, and the factor $1/N$ makes the quantity of order one,
since deviations from $N/K$ scale as $\sqrt{N}$. Note that in our calculations the input vector
presented to the players at each step of the game consists of random components, instead of the
true history. Nevertheless, simulations indicate that the system behavior is only slightly
affected by the randomness of the inputs and the game properties remain similar.

For random players, each weight (among the $MN$ weights $\{w_{ij}\}$) is chosen from a given
unbiased distribution with variance $1/M$. Hence, the distribution of the overlap $R$ between
weights belonging to any two players $\rho$ and $\phi$,

$$ R_{\rho\phi} = \sum_{j=1}^{M} w_{\rho j} w_{\phi j}, \qquad (8) $$

is a Gaussian with zero mean and variance $1/M$. In the thermodynamic limit and for $M > N$, one
can show that in the leading order the overlap between each pair is an independent random
variable. For random players and $K = 2$ one finds
$\langle \epsilon^2 \rangle = \langle \sum_\rho (N_\rho - N/K)^2 / (NK) \rangle = 1/4$; however,
for general $K$ even the derivation of a similar quantity is non-trivial. The two cornerstones
of the calculations below are the probability of a microscopic configuration,
$P(\{\sigma_i\})$, and the degeneracy $D(\{N_\rho\})$ of a macroscopic configuration
$\{N_\rho\}$, which is given by the multinomial coefficient:

$$ D(\{N_\rho\}) = \frac{N!}{\prod_{\rho=1}^{K} N_\rho!}. \qquad (9) $$

In the large $N$ limit, the typical deviation of the size of each group from $N/K$ is expected
to scale with $\sqrt{N}$. Hence we define

$$ N_\rho = N/K + \epsilon_\rho \sqrt{N}, \qquad (10) $$

where it is clear that $\sum_\rho \epsilon_\rho = 0$, and without loss of generality we assume
$N_{\min} = N_1 < N_\rho \ \forall \rho > 1$. Applying the Stirling approximation to Eq. (9)
yields the degeneracy as a function of $\{\epsilon_\rho\}$:

$$ D_K(\{\epsilon_\rho\}) \sim K^N \exp\left( -\frac{K}{2} \sum_{\rho=1}^{K} \epsilon_\rho^2 \right) \delta\left( \sum_{\rho=1}^{K} \epsilon_\rho \right). \qquad (11) $$
FIG. 1. Simulations for $\langle \epsilon_{\min}^2 \rangle$ as a function of $K$ for both the
optimal case $R = -1/(N-1)$ (solid curve) and the random case $R = 0$ (long-dashed curve).
Analytical results up to $K = 6$ agree with simulations for both $R = -1/(N-1)$ (filled
circles) and $R = 0$ (triangles). Inset:
$T_K = \langle \epsilon_{\min}^2 \rangle_{R=0} / \langle \epsilon_{\min}^2 \rangle_{R=-1/(N-1)}$
as a function of $K$.



IV. THE OPTIMAL CASE 



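If the average over the pairwise overlaps $R_{\rho\phi}$, which we denote by $R$, is 0, the
agents make their choice independently and randomly, so each microscopic configuration has the
same probability $P_K = (1/K)^N$. Now the average of $\epsilon_{\min}^2$ can be evaluated:

$$ \langle \epsilon_{\min}^2 \rangle_R = \frac{\int_{-\infty}^{0} \epsilon_1^2 \, d\epsilon_1 \prod_{\rho>1} \int_{\epsilon_1}^{\infty} d\epsilon_\rho \, D_K(\{\epsilon_\rho\}) P_K(\{\epsilon_\rho\})}{\int_{-\infty}^{0} d\epsilon_1 \prod_{\rho>1} \int_{\epsilon_1}^{\infty} d\epsilon_\rho \, D_K(\{\epsilon_\rho\}) P_K(\{\epsilon_\rho\})}. \qquad (12) $$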



The quantity $\langle \epsilon_{\min}^2 \rangle_{R=0}$ was calculated numerically for
$K = 3, 4, 5, 6$ and found to be approximately 0.313, 0.322, 0.320, 0.309, respectively (see
Fig. 1). Results obtained from simulations with $N = 5000$ and $K \le 6$ are in excellent
agreement with Eq. (12). For $K > 6$ the results reported in Fig. 1 were derived only from
simulations and are in excellent agreement with the asymptotic behavior of Eq. (12),
$\langle \epsilon_{\min}^2 \rangle_{R=0} \sim 2\log(K)/K$. Another quantity of interest is the
average deviation of the number of players in each state from $N/K$,
$\langle \epsilon^2 \rangle = \langle \frac{1}{K} \sum_\rho \epsilon_\rho^2 \rangle$. Similarly
to Eq. (12), this quantity can be derived analytically and gives

$$ \langle \epsilon^2 \rangle_{R=0} = \frac{K-1}{K^2}. \qquad (13) $$
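The random reference case can also be checked with a short Monte Carlo sketch in which every player picks one of the $K$ states independently and uniformly. The code below is a hedged illustration, not the simulation used for Fig. 1: the function name `random_case_moments`, the system size and the number of samples are assumptions chosen so that the script runs quickly, and both averages are normalized by $N$ so that they are of order one.

```python
import random

def random_case_moments(N, K, samples, seed=0):
    """Estimate <eps^2> and <eps_min^2> for N players choosing uniformly among K states."""
    rng = random.Random(seed)
    sum_eps2 = 0.0
    sum_eps_min2 = 0.0
    for _ in range(samples):
        occupancy = [0] * K
        for _ in range(N):
            occupancy[rng.randrange(K)] += 1            # independent random decision
        dev2 = [(n - N / K) ** 2 / N for n in occupancy]    # squared deviation per state
        sum_eps2 += sum(dev2) / K                       # average over the K states
        sum_eps_min2 += (min(occupancy) - N / K) ** 2 / N   # deviation of the minority group
    return sum_eps2 / samples, sum_eps_min2 / samples

if __name__ == "__main__":
    for K in (2, 3, 4):
        eps2, eps_min2 = random_case_moments(N=200, K=K, samples=2000)
        print(K, round(eps2, 3), round((K - 1) / K ** 2, 3), round(eps_min2, 3))
```

For independent uniform choices the first estimate should fluctuate around $(K-1)/K^2$ for any $N$, while the second one carries finite-size corrections relative to the large-$N$ values quoted above.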



So far we have compared $\langle \epsilon_{\min}^2 \rangle$ and $\langle \epsilon^2 \rangle$ for
random players, where the average overlap is zero. Without breaking the symmetry among the
players, the weights can be represented by $N$ weight vectors which are symmetrically spread
around their center of mass. More precisely, we denote the weight vector of the $i$th
perceptron as $\mathbf{w}_i$, and assume that it can be expressed as

$$ \mathbf{w}_i = \mathbf{C} + \mathbf{g}_i, \qquad (14) $$

where the center of mass is $\mathbf{C} = \frac{1}{N} \sum_i \mathbf{w}_i$, and
$\{\mathbf{g}_i\}$ are $N$ unit vectors of rank $M$ obeying the symmetry

$$ \mathbf{g}_i \cdot \mathbf{g}_j = \left( 1 + \frac{1}{N-1} \right) \delta_{ij} - \frac{1}{N-1}. \qquad (15) $$

Hence, the total profit and $N_{\min}$ are functions of only one parameter, $C = |\mathbf{C}|$.
It is clear that the maximization of the total profit or $N_{\min}$ (as for the case $K = 2$)
is obtained when $C = 0$, which is the maximal achievable homogeneous repulsion among $N$
vectors of rank $M > N$. The repulsion is the natural tendency of each player in the minority
game, since the goal is to act differently from the other players. Without a cooperation which
breaks the players into sub-groups, the maximal homogeneous repulsion is $R = -1/(N-1)$.
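One simple way to see that such a symmetric configuration exists is to center the $N$ standard basis vectors and rescale them to unit length. The sketch below uses this closed-form construction purely as an illustration; it is an assumption of this example and not the recursive procedure used for the simulations in this paper. The names `simplex_vectors` and `dot` are hypothetical, and the vectors live in $N$ dimensions (they can be padded with zeros to reach a larger $M$).

```python
import math

def simplex_vectors(N):
    """N unit vectors with pairwise overlap -1/(N-1), built from centered basis vectors."""
    scale = math.sqrt(N / (N - 1))
    return [[scale * ((1.0 if j == i else 0.0) - 1.0 / N) for j in range(N)]
            for i in range(N)]

def dot(a, b):
    """Plain scalar product of two equally long lists."""
    return sum(p * q for p, q in zip(a, b))

if __name__ == "__main__":
    g = simplex_vectors(5)
    print(dot(g[0], g[0]), dot(g[0], g[1]))   # close to 1 and to -1/4
```

By construction the vectors sum to zero, so the center of mass of this configuration vanishes.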

The two questions of interest are the following: (a) What are $\langle \epsilon^2 \rangle$ and
$\langle \epsilon_{\min}^2 \rangle$ as a function of $K$ for the optimal solution, $C = 0$ and
$R = -1/(N-1)$? (b) Is the optimal solution achievable by local dynamic rules for each one of
the players? We first examine the former question regarding the optimal solution, and then we
turn to study the dynamic behavior of the players.

The average deviation of the number of players in each state from $N/K$ at $C = 0$ and for
$R = O(1/N)$ can be calculated analytically. The main idea is that this quantity can be
calculated similarly to Eq. (12), or via
$\langle \epsilon^2 \rangle = \frac{1}{NK} \langle \sum_{\rho=1}^{K} ( \sum_{i=1}^{N} \delta_{\sigma_i,\rho} - N/K )^2 \rangle$.
The simplification of the latter expression is that only an average over a pair of players has
to be carried out. The result as a function of $K$ is

$$ \langle \epsilon^2 \rangle_R = \frac{K-1}{K^2} + R (N-1)(K-1) K \mu, \qquad (16) $$

where $\mu = \left[ \int_{-\infty}^{\infty} \frac{e^{-h^2}}{2\pi} (1 - H(h))^{K-2} \, dh \right]^2$
and $H(x) = 0.5\,\mathrm{erfc}(x/\sqrt{2})$.
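The expression for $\langle \epsilon^2 \rangle_R$ is easy to evaluate numerically. The sketch below is a hedged illustration: it assumes that the integrand of $\mu$ is $e^{-h^2}/(2\pi)\,(1 - H(h))^{K-2}$ with $H(x) = 0.5\,\mathrm{erfc}(x/\sqrt{2})$, and it uses a plain trapezoidal rule on a finite interval. The function names `H`, `mu` and `eps2`, the cutoff `h_max` and the number of steps are choices made for this example.

```python
import math

def H(x):
    """Gaussian tail function H(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mu(K, h_max=10.0, steps=20000):
    """Trapezoidal estimate of mu = [ int exp(-h^2)/(2 pi) (1 - H(h))^(K-2) dh ]^2."""
    dh = 2.0 * h_max / steps
    total = 0.0
    for n in range(steps + 1):
        h = -h_max + n * dh
        f = math.exp(-h * h) / (2.0 * math.pi) * (1.0 - H(h)) ** (K - 2)
        total += f * (0.5 if n in (0, steps) else 1.0)   # half weight at the two end points
    return (total * dh) ** 2

def eps2(K, N, R):
    """<eps^2>_R = (K-1)/K^2 + R (N-1)(K-1) K mu, assumed valid for R of order 1/N."""
    return (K - 1) / K ** 2 + R * (N - 1) * (K - 1) * K * mu(K)

if __name__ == "__main__":
    N = 400
    for K in (2, 3, 4, 5, 6):
        print(K, eps2(K, N, 0.0), eps2(K, N, -1.0 / (N - 1)))
```

Under the stated assumption, $K = 2$ gives $\mu = 1/(4\pi)$, so the optimal overlap $R = -1/(N-1)$ lowers $\langle \epsilon^2 \rangle$ from $1/4$ to $1/4 - 1/(2\pi)$.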

Regarding the optimal score, the quantity of particular interest is
$\langle \epsilon_{\min}^2 \rangle_{R=-1/(N-1)}$. This quantity has to be compared with
$\langle \epsilon_{\min}^2 \rangle_{R=0}$ in order to estimate the improvement in the average
global gain relative to the random case. Note that the calculation of Eq. (12) for $R \ne 0$ is
nontrivial, since $P_K(\{\epsilon_\rho\})$ is no longer independent of the configuration
$\{\epsilon_\rho\}$. However, we can overcome this difficulty in the following way. For
$R = O(1/N)$ one can show that in the leading order $P_K(\{\epsilon_\rho\})$ has the same form
as $D_K(\{\epsilon_\rho\})$:

$$ P_K(\{\epsilon_\rho\}) \sim (1/K)^N \exp\left( -A(R) \sum_{\rho=1}^{K} \epsilon_\rho^2 \right) \delta\left( \sum_{\rho=1}^{K} \epsilon_\rho \right), \qquad (17) $$

where the exact value of $A(R)$ is unknown. The observation that both $P_K(\{\epsilon_\rho\})$
and $D_K(\{\epsilon_\rho\})$ have the same dependence on $\{\epsilon_\rho\}$ (Eqs. (11) and
(17)) indicates that the ratio $\langle \epsilon_{\min}^2 \rangle / \langle \epsilon^2 \rangle$
is independent of $R$ if $R = O(1/N)$, and in particular:

$$ \frac{\langle \epsilon_{\min}^2 \rangle_{R=-1/(N-1)}}{\langle \epsilon^2 \rangle_{R=-1/(N-1)}} = \frac{\langle \epsilon_{\min}^2 \rangle_{R=0}}{\langle \epsilon^2 \rangle_{R=0}} \equiv \beta_K. \qquad (18) $$

This property can be easily derived by rescaling $\epsilon_\rho \to \sqrt{A(R)}\,\epsilon_\rho$
in the integral representation (Eq. (12)) of each one of the four terms in Eq. (18). The same
prefactor appears both in the denominator and in the numerator, and the dependence of
$\beta_K$ on $R$ via $A(R)$ cancels out. Using Eq. (18),
$\langle \epsilon_{\min}^2 \rangle_{R=-1/(N-1)}$ can be obtained indirectly from the knowledge
of the other three terms, which are given by Eqs. (12), (13), and (16). Results for
$\langle \epsilon_{\min}^2 \rangle_{R=-1/(N-1)}$ are presented in Fig. 1. In order to confirm
our analytical results we performed simulations for the optimal case, Eqs. (16) and (18), with
$C = 0$. The simulations were done in two stages. In the first stage, $N$ normalized vectors of
rank $M$, obeying the constraint that the overlap among each pair is equal to $-1/(N-1)$, are
generated using a recursive process. The details of the algorithm will be given elsewhere [11].
In the second stage, $\langle \epsilon_{\min}^2 \rangle$ and $\langle \epsilon^2 \rangle$ were
averaged over about $10^5$ randomly chosen inputs for a system with $N = 400$ and $M = 5000$.
An excellent agreement between simulations and analytical results was obtained (see Fig. 1).
The improvement in the global gain can be measured by the ratio
$T_K = \langle \epsilon_{\min}^2 \rangle_{R=0} / \langle \epsilon_{\min}^2 \rangle_{R=-1/(N-1)}$.
This ratio decreases monotonically with $K$, such that its maximal value is $T_2 = 2.7548$ and
$T_K \to 1$ for $K \to \infty$ (inset of Fig. 1).



V. THE DYNAMICS WHICH LEAD TO THE 
OPTIMAL SOLUTION 

So far we derived the properties of the optimal solution for different values of $K$. We now
turn to the second question: is the optimal solution achievable by local dynamic rules
(Eq. (5))? After averaging Eq. (5) over the players, and in the limit where the number of
examples, $\alpha M$, scales with the number of input units $M$, one finds the following
equation of motion for the center of mass:

$$ \frac{dC^2}{d\alpha} = 2 \eta K \left\langle \sum_j C_j \delta_{x_j, \mathrm{min}} \right\rangle + \eta^2 (K-1), \qquad (19) $$

where $\langle \cdot \rangle$ denotes an average over the random examples. For large $M$, in
the leading order each input vector divides each weight vector into $K$ equal groups of size
$M/K$. The minority state is the one whose group of weights gives the minimal sum. Using
Eq. (14) and $M, N \to \infty$, $\langle \sum_j C_j \delta_{x_j,\mathrm{min}} \rangle$ is the
average of the minimal one among the $K$ sums, each taken over a set of $M/K$ center-of-mass
components, $\{C_j\}$. These $M/K$ quantities are random variables with zero mean and variance
$C^2/M$ ($\langle \sum_{j=1}^{M/K} C_j \rangle = 0$ and
$\langle ( \sum_{j=1}^{M/K} C_j )^2 \rangle = C^2/K$). One can find that
$\langle \sum_j C_j \delta_{x_j,\mathrm{min}} \rangle$ is equal to

$$ C \sqrt{\frac{K}{2\pi}} \int_{-\infty}^{\infty} y \, e^{-y^2/2} H^{K-1}(y) \, dy, \qquad (20) $$

with $H(x) = 0.5\,\mathrm{erfc}(x/\sqrt{2})$ as before. Hence, for a given $K$, Eqs. (19) and
(20) indicate a linear relation between the fixed point value of $C$ and the learning rate
$\eta$, with corrections of $O(1/\sqrt{N})$. As $\eta \to 0$, $C \to 0$ and the system
approaches the optimal configuration. The interplay between $C$ and $\eta$ was confirmed by
simulations, where finite-size effects decay as the size of the system becomes larger. This
effect is depicted in Fig. 2 for $K = 3$. The explicit dependence of
$\langle \epsilon_{\min}^2 \rangle_R$ on $C$ can be found for $R \sim O(1/N)$ via the relation

$$ R = \frac{C^2 - \frac{1}{N-1}}{C^2 + 1}. \qquad (21) $$

Results of simulations for $\langle \epsilon_{\min}^2 \rangle_R$ as a function of $C$ for
$N = 103$ and $M = 200$ are presented in the inset of Fig. 2. An excellent agreement between
the analytical prediction and simulations was obtained in the regime of
$C \sim O(1/\sqrt{N})$ (corresponding to $R \sim O(1/N)$).
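The linear relation between the fixed point of $C$ and the learning rate can be made explicit numerically. The sketch below is an illustration under stated assumptions: it takes the average minimal sum to be $C\sqrt{K/2\pi}\int y\,e^{-y^2/2}H^{K-1}(y)\,dy$ with $H(x) = 0.5\,\mathrm{erfc}(x/\sqrt{2})$, sets $dC^2/d\alpha = 0$ in the equation of motion, and uses $R = (C^2 - 1/(N-1))/(C^2 + 1)$ for the overlap. The function names, the integration cutoff and the step count are hypothetical choices.

```python
import math

def H(x):
    """Gaussian tail function H(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def min_integral(K, y_max=10.0, steps=20000):
    """Trapezoidal estimate of int y exp(-y^2/2) H(y)^(K-1) dy (negative for K >= 2)."""
    dy = 2.0 * y_max / steps
    total = 0.0
    for n in range(steps + 1):
        y = -y_max + n * dy
        f = y * math.exp(-0.5 * y * y) * H(y) ** (K - 1)
        total += f * (0.5 if n in (0, steps) else 1.0)
    return total * dy

def fixed_point_C(eta, K):
    """Solve 2 eta K C sqrt(K / 2 pi) I_K + eta^2 (K - 1) = 0 for C."""
    slope = 2.0 * K * math.sqrt(K / (2.0 * math.pi)) * min_integral(K)
    return -eta * (K - 1) / slope

def overlap_R(C, N):
    """Overlap between two players for center-of-mass length C (assumed relation)."""
    return (C * C - 1.0 / (N - 1)) / (C * C + 1.0)

if __name__ == "__main__":
    for eta in (0.01, 0.05, 0.1):
        C = fixed_point_C(eta, K=3)
        print(eta, C, overlap_R(C, N=400))
```

Under these assumptions the $K = 2$ fixed point reduces to $C = \eta\sqrt{2\pi}/4$, and $C = 0$ recovers the optimal overlap $R = -1/(N-1)$.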
FIG. 2. $C$ as a function of $\eta$ for $K = 3$. Analytical results (solid line) and
simulations for $N = 103$, $M = 200$ (long-dashed line), and for $N = 400$, $M = 403$ (dashed
line). Inset: $\langle \epsilon_{\min}^2 \rangle$ as a function of $C$ for $K = 3$. Analytical
results (solid line) and simulations for $N = 103$, $M = 200$ (filled circles).

Note that although the global gain $U$ which corresponds to the Boolean case is monotonic with
$K$, the non-monotonic behavior of $\epsilon_{\min}$ implies that for non-Boolean cases a
non-monotonic behavior of $U$ may be obtained.



Secondly, the other strategies for the minority game that have been studied can be generalized
to multi-choice situations in a straightforward manner: in the original game, where each player
has several decision tables, each table entry is now a value between 1 and $K$. In Johnson's
stochastic strategy [13,14], each player has a probability of choosing the outcome that was
successful the last time, or else picks one of the others with equal probability. In the
strategy of Reents [15], players who were not in the minority can switch to some other action
with a small probability in the next time step. Similarly, other conceivable strategies can
also be generalized. Preliminary checks indicate that all these modified strategies show
behavior similar to that of the binary-choice game, even though their theoretical treatment
probably becomes more involved. While outcomes of these games certainly have to be measured
against the reference values given in Eqs. (12) and (13), it is not clear under what
circumstances relations like Eq. (18) hold for other strategies.

I. K., W. K. and R. M. acknowledge partial support by GIF.



VI. SUMMARY AND OUTLOOK 

In this paper we introduced a generalization of the minority game to the multi-choice case. The
problem was applied to a multilayer network with updating rules for the weights (strategies).
Static and dynamic properties of the strategies were solved analytically for various values of
$K$ and were found to be in good agreement with simulations on finite systems. This
modification of the minority game to the multi-choice case opens a manifold of new questions,
which certainly deserve future research. We have chosen two of those questions to briefly
discuss here.

Firstly, as we have pointed out before, the function according to which the profit is awarded
is not necessarily Boolean as in Eq. (6). In fact, the model is more realistic when the profit
of a player is related to the size of his group, as well as to the size of the other groups
[12]. Our analysis can be applied to these cases if the maximization of the global gain is
equivalent to the maximization of the minority group. However, other scores may not fulfill
this required condition. In these cases, it has to be determined whether the optimal symmetric
configuration remains the maximal repulsion.



[1] D. Challet and Y. C. Zhang, Physica A 246, 407 (1997).
[2] D. Challet and Y. C. Zhang, Physica A 256, 514 (1998).
[3] R. Savit, R. Manuca and R. Riolo, Phys. Rev. Lett. 82, 2203 (1999).
[4] A. Cavagna, J. P. Garrahan, I. Giardina and D. Sherrington, Phys. Rev. Lett. 83, 4429 (1999).
[5] D. Challet, M. Marsili and R. Zecchina, Phys. Rev. Lett. (in press).
[6] M. Marsili, D. Challet and R. Zecchina, cond-mat/9908480.
[7] D. H. Wolpert, K. Tumer and J. Frank, electronic preprint, cs.LG/9905004.
[8] R. Metzler, W. Kinzel and I. Kanter, Phys. Rev. E 62, 2555 (2000).
[9] I. Kanter, Phys. Rev. A 37, 2739 (1988).
[10] T. L. H. Watkin, A. Rau, D. Bolle and J. van Mourik, J. Phys. I France 2, 167 (1992).
[11] L. Ein-Dor, unpublished.
[12] Y. Li, A. VanDeemen and R. Savit, Physica A 284, 461 (2000).
[13] N. F. Johnson, P. M. Hui, R. Jonson and T. S. Lo, Phys. Rev. Lett. 82, 3360 (1999).
[14] T. S. Lo, P. M. Hui and N. F. Johnson, cond-mat/0003379 (2000).
[15] G. Reents, R. Metzler and W. Kinzel, cond-mat/0007351 (2000).



